candidate choice


Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference

Shen, Ke, Kejriwal, Mayank

arXiv.org Artificial Intelligence

Despite their impressive performance, large language models (LLMs) such as ChatGPT are known to pose important risks. One such set of risks arises from misplaced confidence, whether over-confidence or under-confidence, in the model's inference. While the former is well studied, the latter is not, leading to an asymmetry in understanding the comprehensive risk of a model based on misplaced confidence. In this paper, we address this asymmetry by defining two types of risk (decision and composite risk) and proposing an experimental framework consisting of a two-level inference architecture and appropriate metrics for measuring such risks in both discriminative and generative LLMs. The first level relies on a decision rule that determines whether the underlying language model should abstain from inference. The second level (which applies if the model does not abstain) is the model's inference itself. Detailed experiments on four natural language commonsense reasoning datasets, using both an open-source ensemble-based RoBERTa model and ChatGPT, demonstrate the practical utility of the evaluation framework. For example, our results show that our framework can get an LLM to confidently respond to an extra 20.1% of low-risk inference tasks that other methods might misclassify as high-risk, and to skip 19.8% of high-risk tasks that would have been answered incorrectly.
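The two-level architecture described above can be sketched as a confidence-thresholded decision rule. The threshold value, function names, and probabilities below are illustrative assumptions for exposition, not the paper's calibrated decision rule or risk metrics.

```python
import numpy as np

def two_level_inference(probs, tau=0.6):
    """Level 1: abstain if the model's top confidence falls below tau.
    Level 2: otherwise, answer with the most confident choice.
    `tau` is an illustrative threshold, not the paper's learned rule."""
    probs = np.asarray(probs, dtype=float)
    if probs.max() < tau:
        return None  # abstain: risk of answering deemed too high
    return int(probs.argmax())  # commit to the top-scoring choice

# A confident distribution is answered; a flat one triggers abstention.
print(two_level_inference([0.1, 0.8, 0.1]))    # -> 1
print(two_level_inference([0.4, 0.35, 0.25]))  # -> None
```

Decision risk then corresponds to abstaining on tasks the model would have answered correctly (or answering ones it should have skipped), which this rule makes directly measurable.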


Computational-level Analysis of Constraint Compliance for General Intelligence

Wray, Robert E., Jones, Steven J., Laird, John E.

arXiv.org Artificial Intelligence

Human behavior is conditioned by codes and norms that constrain action. Rules, "manners," laws, and moral imperatives are examples of classes of constraints that govern human behavior. These systems of constraints are "messy": individual constraints are often poorly defined, the constraints relevant in a particular situation may be unknown or ambiguous, constraints interact and conflict with one another, and determining how to act within the bounds of the relevant constraints may be a significant challenge, especially when rapid decisions are needed. Despite such messiness, humans incorporate constraints into their decisions robustly and rapidly. General, artificially intelligent agents must also be able to navigate the messiness of systems of real-world constraints in order to behave predictably and reliably. In this paper, we characterize sources of complexity in constraint processing for general agents and describe a computational-level analysis of such constraint compliance. We identify key algorithmic requirements based on the computational-level analysis and outline an initial, exploratory implementation of a general approach to constraint compliance.


Choice Fusion as Knowledge for Zero-Shot Dialogue State Tracking

Su, Ruolin, Yang, Jingfeng, Wu, Ting-Wei, Juang, Biing-Hwang

arXiv.org Artificial Intelligence

With the demanding need for deploying dialogue systems in new domains at lower cost, zero-shot dialogue state tracking (DST), which tracks a user's requirements in task-oriented dialogues without training on the desired domains, is drawing increasing attention. Although prior works have leveraged question-answering (QA) data to reduce the need for in-domain training in DST, they fail to explicitly model knowledge transfer and fusion for tracking dialogue states. To address this issue, we propose CoFunDST, which is trained on domain-agnostic QA datasets and directly uses candidate choices of slot-values as knowledge for zero-shot dialogue-state generation.

Nowadays, the requirements of deploying an increasing number of services across a variety of domains raise challenges for DST models in production [4]. However, existing dialogue datasets only span a few domains, making it impossible to train a DST model on all conceivable conversation flows [5]. Furthermore, dialogue systems are required to infer dialogue states with dynamic techniques and offer diverse interfaces for different services. Despite the fact that the copy mechanism [6] and dialogue acts [7] have been leveraged to efficiently track slots and values in the dialogue history, the performance of DST still relies on a large number of annotations of dialogue states, which are expensive and inefficient to collect for every new domain and service.
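As a rough illustration of using candidate slot-value choices as knowledge, one zero-shot DST step can be sketched as scoring each candidate value against the dialogue context and keeping the best one. The scorer, names, and threshold here are assumptions standing in for a QA-trained model, not CoFunDST itself.

```python
import numpy as np

def select_slot_value(score_fn, context, slot, candidates):
    """Zero-shot DST sketch: treat each candidate slot-value as a QA
    answer choice, pick the best-scoring one, and fall back to 'none'
    when no candidate is supported by the context at all."""
    scores = [score_fn(context, f"{slot} = {v}") for v in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else "none"

# Toy lexical-overlap scorer as a stand-in for the QA model.
overlap = lambda ctx, hyp: len(set(ctx.lower().split()) & set(hyp.lower().split()))
ctx = "i need a cheap hotel in the north"
print(select_slot_value(overlap, ctx, "price range", ["cheap", "expensive"]))  # -> cheap
print(select_slot_value(overlap, ctx, "parking", ["yes", "no"]))               # -> none
```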


Understanding Prior Bias and Choice Paralysis in Transformer-based Language Representation Models through Four Experimental Probes

Shen, Ke, Kejriwal, Mayank

arXiv.org Artificial Intelligence

Recent work on transformer-based neural networks has led to impressive advances on multiple-choice natural language understanding (NLU) problems, such as Question Answering (QA) and abductive reasoning. Despite these advances, there is still limited work on understanding whether these models respond to perturbed multiple-choice instances in a sufficiently robust manner to be trusted in real-world situations. We present four confusion probes, inspired by similar phenomena first identified in the behavioral science community, to test for problems such as prior bias and choice paralysis. Experimentally, we probe a widely used transformer-based multiple-choice NLU system using four established benchmark datasets. We show that the model exhibits significant prior bias and, to a lesser but still highly significant degree, choice paralysis, in addition to other problems. Our results suggest that stronger testing protocols and additional benchmarks may be necessary before such language models are used in front-facing systems or in decision making with real-world consequences.
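A minimal version of a prior-bias probe can be sketched as checking how often a scorer's choice survives blanking out the question: if the answer rarely changes, the model is leaning on choice-only priors rather than the question. The scorer and data below are toy assumptions, not the paper's benchmarks or exact protocol.

```python
import numpy as np

def prior_bias_rate(score_fn, instances):
    """Fraction of instances where the top choice is unchanged when the
    question is replaced by an empty string. `score_fn(question, choices)`
    returns one score per choice and stands in for any multiple-choice
    NLU scorer."""
    same = 0
    for question, choices in instances:
        full = np.argmax(score_fn(question, choices))
        blank = np.argmax(score_fn("", choices))
        same += int(full == blank)
    return same / len(instances)

# Toy scorer that ignores the question entirely: maximal prior bias.
length_scorer = lambda q, choices: [len(c) for c in choices]
data = [("Why is the sky blue?", ["scattering", "magic"]),
        ("What is 2+2?", ["four", "a banana"])]
print(prior_bias_rate(length_scorer, data))  # -> 1.0
```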


Fuzzy Forests For Feature Selection in High-Dimensional Survey Data: An Application to the 2020 U.S. Presidential Election

Dey, Sreemanti, Alvarez, R. Michael

arXiv.org Machine Learning

An increasingly common methodological issue in the social sciences is high-dimensional, highly correlated datasets that are not amenable to the traditional deductive framework of study. Analysis of candidate choice in the 2020 presidential election is one area in which this issue presents itself: to test the many theories explaining the outcome of the election, it is necessary to use data such as the 2020 Cooperative Election Study Common Content, with hundreds of highly correlated features. We present the Fuzzy Forests algorithm, a variant of the popular Random Forests ensemble method, as an efficient way to reduce the feature space in such cases with minimal bias, while also maintaining predictive performance on par with common algorithms like Random Forests and logit. Using Fuzzy Forests, we isolate the top correlates of candidate choice and find that partisan polarization was the strongest factor driving the 2020 presidential election. Social science research today often encounters a difficult methodological situation: larger and larger datasets containing high-dimensional, highly correlated features [7]. As in the application we discuss in this paper (the 2020 U.S. presidential election), to test the many different theories and potential explanations for why voters decided to remove then-President Trump from office, researchers need methodologies that can quickly and efficiently reduce the feature space from hundreds of possible features to a smaller set that can then be the focus of further study. In this paper we present Fuzzy Forests, a variant of the popular Random Forests algorithm, which we argue is well suited to exactly this type of applied machine learning problem [6]. Fuzzy Forests is ideal for feature selection in large, high-dimensional datasets where the features are highly correlated.
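The screening idea behind Fuzzy Forests can be sketched as keeping only the top-ranked features within each module of correlated features, then studying the survivors. The correlation-based importance below is a simple stand-in for the random-forest importance the actual algorithm uses, and all names and data are illustrative assumptions.

```python
import numpy as np

def modulewise_screen(X, y, modules, importance_fn, keep_frac=0.25):
    """Sketch of a module-wise screening step: within each module of
    correlated features, rank features by an importance function and
    keep only the top fraction. `importance_fn(Xm, y)` returns one
    score per column of Xm; here it is a placeholder for a
    random-forest importance measure."""
    survivors = []
    for module in modules:
        Xm = X[:, module]
        scores = importance_fn(Xm, y)
        k = max(1, int(len(module) * keep_frac))
        top = np.argsort(scores)[::-1][:k]       # best-scoring features first
        survivors.extend(int(module[i]) for i in top)
    return sorted(survivors)

# Toy example: |correlation with y| stands in for RF importance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] + 1.5 * X[:, 4] + rng.normal(size=200)
corr_importance = lambda Xm, y: np.abs(
    [np.corrcoef(Xm[:, j], y)[0, 1] for j in range(Xm.shape[1])])
modules = [np.array([0, 1, 2]), np.array([3, 4, 5])]
print(modulewise_screen(X, y, modules, corr_importance, keep_frac=0.34))  # -> [1, 4]
```

Screening per module, rather than globally, is what keeps strongly correlated features from crowding each other out of the final candidate set.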


Exploiting Sentence Embedding for Medical Question Answering

Hao, Yu, Liu, Xien, Wu, Ji, Lv, Ping

arXiv.org Artificial Intelligence

Despite the great success of word embedding, sentence embedding remains far from a solved problem. In this paper, we present a supervised learning framework that exploits sentence embedding for the medical question answering task. The framework consists of two main parts: 1) a sentence-embedding module, and 2) a scoring module. The former is developed with contextual self-attention and multi-scale techniques to encode a sentence into an embedding tensor; we call this module Contextual self-Attention Multi-scale Sentence Embedding (CAMSE) for short. The latter employs two scoring strategies: Semantic Matching Scoring (SMS) and Semantic Association Scoring (SAS). SMS measures similarity, while SAS captures the association between sentence pairs: a medical question concatenated with a candidate choice, and a piece of corresponding supportive evidence. The proposed framework is evaluated on two Medical Question Answering (MedicalQA) datasets collected from real-world applications: medical exams and clinical diagnosis based on electronic medical records (EMR). The comparison results show that our proposed framework achieves significant improvements over competitive baseline approaches. Additionally, a series of controlled experiments illustrates that the multi-scale strategy and the contextual self-attention layer play important roles in producing effective sentence embeddings, and that the two scoring strategies are highly complementary for question answering problems.
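The two scoring strategies can be roughly sketched on pooled embeddings: a cosine similarity for SMS and a learned bilinear form for SAS. The mean pooling, random weights, and function names are assumptions for illustration; the paper defines both scores on CAMSE embedding tensors, not on pooled vectors.

```python
import numpy as np

def sms_score(q_emb, e_emb):
    """Semantic Matching Scoring sketch: cosine similarity between
    pooled sentence embeddings of the question+choice pair and the
    evidence sentence."""
    q, e = q_emb.mean(axis=0), e_emb.mean(axis=0)
    return float(q @ e / (np.linalg.norm(q) * np.linalg.norm(e) + 1e-9))

def sas_score(q_emb, e_emb, W):
    """Semantic Association Scoring sketch: a bilinear form q^T W e
    capturing association rather than raw similarity; W would be
    learned in the real model and is random here."""
    q, e = q_emb.mean(axis=0), e_emb.mean(axis=0)
    return float(q @ W @ e)

rng = np.random.default_rng(1)
q = rng.normal(size=(7, 16))    # question + candidate choice, 7 tokens
ev = rng.normal(size=(12, 16))  # supportive evidence sentence, 12 tokens
W = rng.normal(size=(16, 16))
print(round(sms_score(q, ev), 3), round(sas_score(q, ev, W), 3))
```

Because SMS is bounded similarity and SAS is an unconstrained learned association, the two signals are naturally complementary when combined for answer scoring.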